Mystique: Deconstructing SVG Charts for Layout Reuse

(Proceedings of IEEE VIS 2023)

Chen Chen1    Bongshin Lee2    Yunhai Wang3    Yunjeong Chang1    Zhicheng Liu1
1University of Maryland    2Microsoft Research    3Shandong University


Abstract:

To facilitate the reuse of existing charts, previous research has examined how to obtain a semantic understanding of a chart by deconstructing its visual representation into reusable components, such as encodings. However, existing deconstruction approaches primarily focus on chart styles, handling only basic layouts. In this paper, we investigate how to deconstruct chart layouts, focusing on rectangle-based ones, as they cover not only 17 chart types but also advanced layouts (e.g., small multiples, nested layouts). We develop an interactive tool, called Mystique, adopting a mixed-initiative approach to extract the axes and legend, and deconstruct a chart’s layout into four semantic components: mark groups, spatial relationships, data encodings, and graphical constraints. Mystique employs a wizard interface that guides chart authors through a series of steps to specify how the deconstructed components map to their own data. On 150 rectangle-based SVG charts, Mystique achieves above 85% accuracy for axis and legend extraction and 96% accuracy for layout deconstruction. In a chart reproduction study, participants could easily reuse existing charts on new datasets. We discuss the current limitations of Mystique and future research directions.




Results:





Figure 1: Each pair shows an existing chart (left) and a new chart created using Mystique (right). The existing charts were produced using a variety of tools, such as D3, Vega-Lite, Mascot, PlotDB, Highcharts, and Data Illustrator.



Figure 2: The end-to-end pipeline for reusing an SVG chart to create a new chart in Mystique.



Figure 3: (a) the result panel for axis and legend detection; (b) a sample dataset provided by Mystique; (c) the reuse UI consisting of six components.



Figure 4: Chart decomposition process for four different chart segments. The matrix cells store the results from the distance function for each pair of rectangles or groups. Since the matrix is symmetrical, the gray cells do not have to be computed. HS (VS) stands for horizontal (vertical) stack, HG (VG) stands for horizontal (vertical) grid, P stands for packing, -1 means overlapping rectangles, and X means null.
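
The pairwise matrix described in the Figure 4 caption can be illustrated with a minimal sketch. This is not Mystique's actual distance function, which this page does not specify. The names `Rect`, `distance`, and `distance_matrix` are hypothetical, and the rule used here (the gap along the one axis that separates two rectangles, -1 for overlap, `None` for pairs separated on both axes) is an assumption chosen only to show how the -1 and null cells and the uncomputed half of a symmetric matrix might arise.

```python
from typing import List, NamedTuple, Optional


class Rect(NamedTuple):
    """Axis-aligned rectangle (hypothetical type, not from the paper)."""
    x: float
    y: float
    w: float
    h: float


def axis_gap(a_min: float, a_max: float, b_min: float, b_max: float) -> float:
    """Signed gap between two 1-D intervals; negative when they overlap."""
    return max(a_min, b_min) - min(a_max, b_max)


def distance(a: Rect, b: Rect) -> Optional[float]:
    """Assumed distance rule, for illustration only.

    -1    -> the rectangles overlap
    None  -> the rectangles are separated on both axes (no shared row/column)
    gap   -> the spacing along the single axis that separates them
    """
    gx = axis_gap(a.x, a.x + a.w, b.x, b.x + b.w)
    gy = axis_gap(a.y, a.y + a.h, b.y, b.y + b.h)
    if gx < 0 and gy < 0:
        return -1
    if gx >= 0 and gy >= 0:
        return None
    return gx if gx >= 0 else gy


def distance_matrix(rects: List[Rect]) -> List[List[Optional[float]]]:
    """Fill only the upper triangle; the matrix is symmetric, so the
    lower triangle and the diagonal are left as None (not computed)."""
    n = len(rects)
    matrix: List[List[Optional[float]]] = [[None] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i][j] = distance(rects[i], rects[j])
    return matrix
```

Computing only the cells with `j > i` roughly halves the work, which matches the caption's remark that the gray cells do not have to be computed. In this sketch a `None` in the upper triangle means "null", while a `None` in the lower triangle simply means "skipped"; a real implementation would likely distinguish the two.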



Materials:





Paper: [PDF 2.4M].

Acknowledgements:

Chen Chen and Zhicheng Liu were supported in part by NSF grant IIS-2239130, and Yunhai Wang was supported by NSFC (No. 62132017, 62141217) and the Shandong Provincial Natural Science Foundation (No. ZQ2022JQ32).